Clustering Categorical Data via Ensembling Dissimilarity Matrices
Similar resources
Improved K-Modes for Categorical Clustering Using Weighted Dissimilarity Measure
K-Modes is an extension of the K-Means clustering algorithm, developed to cluster categorical data, in which the mean is replaced by the mode. The dissimilarity measure proposed by Huang is the simple matching (mismatch-counting) measure. The weights of attribute values contribute substantially to clustering; in this paper we therefore propose a new weighted dissimilarity measure for K-Modes, based on the ratio of frequency...
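To make the baseline concrete, here is a minimal Python sketch of the two ingredients the abstract names: the simple matching dissimilarity and the attribute-wise mode that replaces the mean. It does not implement the weighted measure the paper proposes, since that definition is cut off in the excerpt. The function names `simple_matching` and `cluster_mode` and the toy records are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def simple_matching(x, y):
    """Count the attributes on which two categorical records disagree."""
    if len(x) != len(y):
        raise ValueError("records must have the same number of attributes")
    return sum(a != b for a, b in zip(x, y))


def cluster_mode(records):
    """Attribute-wise most frequent category (ties go to the first value seen)."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


# Hypothetical toy cluster with three categorical attributes.
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
]

mode = cluster_mode(records)
distances = [simple_matching(r, mode) for r in records]
print(mode)
print(distances)
```

In a full K-Modes loop, each record would be assigned to the mode with the smallest dissimilarity and the modes recomputed until the assignments stop changing.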
Clustering in ordered dissimilarity data
This paper presents a new technique for clustering either object or relational data. First, the data are represented as a matrix D of dissimilarity values. D is reordered to D∗ using a visual assessment of cluster tendency algorithm. If the data contain clusters, they are suggested by visually apparent dark squares arrayed along the main diagonal of an image I (D∗) of D∗. The suggested clusters...
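The reordering of D into D∗ can be illustrated with a short Python sketch. It follows the ordering scheme commonly described for visual assessment of cluster tendency (start from an object involved in the largest dissimilarity, then repeatedly append the unordered object closest to the ordered set), which is assumed here to be what the abstract refers to. The names `vat_order` and `reorder` and the 4×4 matrix are hypothetical.

```python
def vat_order(D):
    """Return an ordering of objects for a symmetric dissimilarity matrix D."""
    n = len(D)
    # Start from a row that contains the largest dissimilarity.
    start = max(range(n), key=lambda i: max(D[i]))
    order = [start]
    remaining = [i for i in range(n) if i != start]
    # nearest[j] is the smallest dissimilarity from j to any ordered object.
    nearest = list(D[start])
    while remaining:
        j = min(remaining, key=lambda k: nearest[k])
        order.append(j)
        remaining.remove(j)
        nearest = [min(a, b) for a, b in zip(nearest, D[j])]
    return order


def reorder(D, order):
    """Permute the rows and columns of D by the same ordering."""
    return [[D[i][j] for j in order] for i in order]


# Hypothetical data: objects 0 and 2 are close, as are objects 1 and 3.
D = [
    [0, 9, 1, 8],
    [9, 0, 8, 1],
    [1, 8, 0, 9],
    [8, 1, 9, 0],
]

order = vat_order(D)
D_star = reorder(D, order)
for row in D_star:
    print(row)
```

Displayed as a grayscale image with small values dark, the reordered matrix shows the two groups as dark 2×2 squares along the main diagonal.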
Clustering Categorical Data
Dynamical systems approaches for clustering categorical data have been studied by some authors [1]. However, the proposed dynamic algorithm cannot guarantee convergence, so the execution may enter an infinite loop even for very simple data. We define a new configuration updating algorithm for clustering categorical data sets. Let us consider a relational table with k fields, each of which can a...
Clustering categorical data streams
The data stream model has been defined for new classes of applications involving massive data being generated at a fast pace. Web click stream analysis and detection of network intrusions are two examples. Cluster analysis on data streams becomes more difficult, because the data objects in a data stream must be accessed in order and can be read only once or a few times with limited resources. Rec...
Clustering Large Categorical Data Set via Genetic Algorithms
Summary: We illustrate two extensions, proposed by Huang (1998) and by Kaufman and Rousseeuw (1990), of the well-known moving-centers clustering algorithm to the case of qualitative variables. We propose a new clustering methodology based on the logic of genetic algorithms. The application of these algorithms to a real data set and the analysis of the results obtained...
Journal
Journal title: Journal of Computational and Graphical Statistics
Year: 2017
ISSN: 1061-8600,1537-2715
DOI: 10.1080/10618600.2017.1305278